Multi-Kepler GPU vs. multi-Intel MIC for spin systems simulations
Abstract
We present and compare the performance of two many-core architectures, the Nvidia Kepler and the Intel MIC, both in a single system and in cluster configuration, for the simulation of spin systems. As a benchmark we consider the time required to update a single spin of the 3D Heisenberg spin glass model using the over-relaxation algorithm. We also present data for a traditional high-end multi-core architecture, the Intel Sandy Bridge. The results show that, although basically the same code can be used on the two Intel architectures, the performance of an Intel MIC changes dramatically depending on (apparently) minor details. Another issue is that obtaining reasonable scalability with the Intel Phi coprocessor (the Phi is the coprocessor that implements the MIC architecture) in cluster configuration requires the so-called offload mode, which reduces the performance of the single system. As for the GPU, the Kepler architecture offers a clear advantage over the previous Fermi architecture while maintaining exactly the same source code. Scalability of the multi-GPU implementation remains very good when the CPU is used as a communication co-processor of the GPU. All source codes are provided for inspection and for double-checking the results.
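The benchmarked operation, a single-spin over-relaxation update, can be illustrated with a minimal sketch. It assumes the standard over-relaxation move, in which a spin s is reflected about its local field h (the coupling-weighted sum of its six nearest neighbours), s' = 2 (s·h)/(h·h) h − s, which leaves the energy unchanged. The nested-list lattice layout, the Gaussian couplings, and all function names below are illustrative assumptions and are not taken from the authors' source codes.

```python
import random

# Unit steps along the three lattice axes (assumed simple cubic lattice).
STEPS = ((1, 0, 0), (0, 1, 0), (0, 0, 1))


def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def random_unit_vector(rng):
    """Draw a 3-component unit vector (isotropic, via normalised Gaussians)."""
    while True:
        v = [rng.gauss(0.0, 1.0) for _ in range(3)]
        norm = dot(v, v) ** 0.5
        if norm > 0.0:
            return [c / norm for c in v]


def make_lattice(L, rng):
    """Hypothetical layout: spins[x][y][z] is a 3-vector and
    couplings[axis][x][y][z] is the bond from (x, y, z) to its forward
    neighbour along `axis`. Gaussian couplings are an assumption."""
    spins = [[[random_unit_vector(rng) for _ in range(L)]
              for _ in range(L)] for _ in range(L)]
    couplings = [[[[rng.gauss(0.0, 1.0) for _ in range(L)]
                   for _ in range(L)] for _ in range(L)] for _ in range(3)]
    return spins, couplings


def local_field(spins, couplings, L, x, y, z):
    """Sum of J_ij * s_j over the six nearest neighbours (periodic boundaries)."""
    h = [0.0, 0.0, 0.0]
    for axis, (dx, dy, dz) in enumerate(STEPS):
        fx, fy, fz = (x + dx) % L, (y + dy) % L, (z + dz) % L
        bx, by, bz = (x - dx) % L, (y - dy) % L, (z - dz) % L
        j_fwd = couplings[axis][x][y][z]      # bond stored at this site
        j_bwd = couplings[axis][bx][by][bz]   # bond stored at the backward neighbour
        for c in range(3):
            h[c] += j_fwd * spins[fx][fy][fz][c] + j_bwd * spins[bx][by][bz][c]
    return h


def overrelax(s, h):
    """Reflect spin s about the local field h: s' = 2 (s.h)/(h.h) h - s."""
    hh = dot(h, h)
    if hh == 0.0:
        return list(s)  # no field, nothing to reflect about
    factor = 2.0 * dot(s, h) / hh
    return [factor * h[c] - s[c] for c in range(3)]


def sweep(spins, couplings, L):
    """Sequentially apply the over-relaxation update to every site, in place."""
    for x in range(L):
        for y in range(L):
            for z in range(L):
                h = local_field(spins, couplings, L, x, y, z)
                spins[x][y][z] = overrelax(spins[x][y][z], h)


def total_energy(spins, couplings, L):
    """E = -sum over bonds of J_ij s_i.s_j, counting each bond once."""
    e = 0.0
    for x in range(L):
        for y in range(L):
            for z in range(L):
                for axis, (dx, dy, dz) in enumerate(STEPS):
                    n = spins[(x + dx) % L][(y + dy) % L][(z + dz) % L]
                    e -= couplings[axis][x][y][z] * dot(spins[x][y][z], n)
    return e


if __name__ == "__main__":
    rng = random.Random(0)
    L = 4
    spins, couplings = make_lattice(L, rng)
    print("energy before sweep:", total_energy(spins, couplings, L))
    sweep(spins, couplings, L)
    print("energy after sweep: ", total_energy(spins, couplings, L))
```

Because each update preserves both the spin length and the product s·h, the total energy before and after a sweep should agree up to floating-point rounding. A production implementation on a GPU or MIC would typically update non-interacting sublattices in parallel rather than sequentially as done here.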
Similar resources
Optimizing the Monte Carlo Neutron Cross-section Construction Code, XSBench, to MIC and GPU Platforms
XSBench is a proxy application developed by Argonne National Laboratory (ANL). It is used to study the performance of nuclear macroscopic cross-section data construction — usually the most time-consuming process in Monte Carlo neutron transport simulations. In this paper we report on our experience in optimizing XSBench to Intel multi-core CPUs, Many Integrated Core coprocessors (MICs) and Nvid...
Evaluating multi-core and many-core architectures through accelerating the three-dimensional Lax-Wendroff correction stencil
Wave propagation forward modeling is a widely used computational method in oil and gas exploration. The iterative stencil loops in such problems have broad applications in scientific computing. However, executing such loops can be highly time-consuming, which greatly limits their performance and power efficiency. In this paper, we accelerate the forward-modeling technique on the latest multi-co...
Stackless Multi-BVH Traversal for CPU, MIC and GPU Ray Tracing
Stackless traversal algorithms for ray tracing acceleration structures require significantly less storage per ray than ordinary stack-based ones. This advantage is important for massively parallel rendering methods, where there are many rays in flight. On SIMD architectures, a commonly used acceleration structure is the multi bounding volume hierarchy (MBVH), which has multiple bounding boxes p...
Monte Carlo Simulations of Spin Systems on Multi-core Processors
We implement Monte Carlo algorithms for the simulation of spin-glass systems and optimize our codes for recent multi-core CPU and GPU architectures. We consider both the Ising (binary) and Heisenberg (floating-point) spin-glass models. We provide performance figures for the Intel Nehalem and the IBM Cell/BE CPUs and the Nvidia Tesla C1060 GPU; we also draw a comparison with the performance of d...
Visual Exploration of Data with Multithread MIC Computer Architectures
Knowledge mining from immense datasets requires fast, reliable and affordable tools for their visual and interactive exploration. Multidimensional scaling (MDS) is a good candidate for embedding of high-dimensional data into visually perceived 2-D and 3-D spaces. We focus here on the way to increase the computational performance of MDS in the context of interactive, hierarchical, visualization ...
Journal:
- Computer Physics Communications
Volume 185, issue
Pages -
Publication date: 2014